.. _Build NeurEco Compression model with the Python API:

Build NeurEco Compression model with the Python API
#####################################################

To build a NeurEco Compression model with the Python API, import the **NeurEcoTabular** library:

.. code-block:: python

    from NeurEco import NeurEcoTabular as Tabular

Initialize a NeurEco object to handle the **Compression** problem:

.. code-block:: python

    model = Tabular.Compressor()

Call the method **build** with the parameters set for the problem under consideration (a minimal usage sketch follows the parameter list):

.. code-block:: python

    model.build(input_data,
                validation_input_data=None,
                write_model_to="",
                write_compression_model_to="",
                write_decompression_model_to="",
                compress_tolerance=0.01,
                valid_percentage=33.33,
                use_gpu=False,
                inputs_scaling=None,
                inputs_shifting=None,
                inputs_normalize_per_feature=None,
                minimum_compression_coefficients=1,
                compress_decompress_size_ratio=1,
                start_build_from_model_number=-1,
                freeze_structure=False,
                initial_beta_reg=0.1,
                gpu_id=0,
                checkpoint_to_start_build_from="",
                checkpoint_address="",
                validation_indices=None,
                final_learning=True)

:input_data: numpy array, required. Numpy array of training input data. The shape is :math:`(m,\ n)` where :math:`m` is the number of training samples and :math:`n` is the number of input features.
:validation_input_data: numpy array, optional, default = None. Numpy array of validation input data. The shape is :math:`(m,\ n)` where :math:`m` is the number of validation samples and :math:`n` is the number of input features.
:write_model_to: string, optional, default = "". Path where the model will be saved.
:write_compression_model_to: string, optional, default = "". Path where the compression component of the model will be saved.
:write_decompression_model_to: string, optional, default = "". Path where the decompression component of the model will be saved.
:compress_tolerance: float, optional, default = 0.01. Tolerance of the compressor: the maximum error accepted when performing a compression followed by a decompression on the validation data.
:validation_indices: numpy array or list, optional, default = None. Indices of the samples of the training data to be used as validation samples. When set, the field **valid_percentage** is ignored. The lowest accepted index is 1, while the highest is the number of samples.
:valid_percentage: float, optional, default = 33.33. Percentage of the data that NeurEco selects to use as validation data. The minimum value is 10%, the maximum value is 50%. Ignored when **validation_indices** or **validation_input_data** is provided.
:use_gpu: boolean, optional, default = False. Set to True to use a GPU for the build.
:inputs_scaling: string, optional, default = 'auto'. Possible values: 'max', 'max_centered', 'std', 'auto', 'none'. See :std:ref:`Normalizing the data Tabular Compression` for more details.
:inputs_shifting: string, optional, default = 'auto'. Possible values: 'mean', 'min_centered', 'auto', 'none'. See :std:ref:`Normalizing the data Tabular Compression` for more details.
:inputs_normalize_per_feature: bool, optional, default = True. See :std:ref:`Normalizing the data Tabular Compression` for more details.
:minimum_compression_coefficients: int, optional, default = 1. Minimum number of nonlinear coefficients; once it is reached, NeurEco stops reducing the number of neurons of the compression layer.
:compress_decompress_size_ratio: float, optional, default = 1.0. Ratio between the sizes of the compression block and the decompression block. This number is always greater than 0 and smaller than or equal to 1. Note that this ratio is respected within the limits of what NeurEco finds possible.
:start_build_from_model_number: int, optional, default = -1. When resuming a build, specifies which intermediate model in the checkpoint is used as the starting point. When set to -1, NeurEco chooses the last model created as the starting point. The model numbers must be in the interval :math:`[0, n[` where :math:`n` is the total number of networks in the checkpoint.
:freeze_structure: bool, optional, default = False. When resuming a build with this parameter set to True, NeurEco only changes the weights, not the network architecture.
:initial_beta_reg: float, optional, default = 0.1. Initial value of the regularization parameter.
:gpu_id: int, optional, default = 0. Id of the GPU card to use when **use_gpu** is True and multiple cards are available.
:checkpoint_to_start_build_from: string, optional, default = "". Path to the checkpoint file. When set, the build starts from the already existing model (for example, when the previous build has stopped for some reason and the same data is reused, or when additional/different data or settings are used).
:checkpoint_address: string, optional, default = "". Path where the checkpoint model will be saved. The checkpoint model is used for resuming the build of a model, or for choosing an intermediate network with fewer topological optimization steps.
:final_learning: boolean, optional, default = True. If set to True, NeurEco includes the validation data in the training data at the very end of the learning process and attempts to improve the results.
:return: set_status: 0 if ok, other if not.
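A minimal call could look like the following sketch. The data shapes and file names are illustrative assumptions, not values prescribed by NeurEco:

.. code-block:: python

    import numpy as np
    from NeurEco import NeurEcoTabular as Tabular

    # Illustrative training data: 1000 samples with 50 features each
    input_data = np.random.rand(1000, 50)

    model = Tabular.Compressor()
    build_status = model.build(input_data,
                               compress_tolerance=0.01,
                               valid_percentage=33.33,
                               write_model_to="CompressionModel.ednn",
                               checkpoint_address="CompressionModel.checkpoint")
    print("build status:", build_status)  # 0 if the build succeeded

All parameters other than **input_data** are optional; the call above only overrides the paths where the model and the checkpoint are written.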
.. _Normalizing the data Tabular Compression:

Data normalization for Tabular Compression
===========================================

.. include:: ../../CommonParts/NormalizationTabularPerFeaturePythonCompression.rst

.. include:: ../../CommonParts/NormalizationTabular.rst

Particular cases of Build for a Tabular Compression
===========================================================

.. _Select a model from a checkpoint and improve it Compression Python API:

Select a model from a checkpoint and improve it
------------------------------------------------

.. include:: ../../CommonParts/Choose model and apply final learning python part1.rst

It is possible to export the chosen model as it is from the checkpoint, see :std:ref:`Export NeurEco Compression model with the Python API`.

.. include:: ../../CommonParts/Choose model and apply final learning Python part2.rst
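As an illustration, resuming a build from a given intermediate model of an existing checkpoint, while refining only its weights, could look like the following sketch. The checkpoint path and the model number are assumptions; only the parameters documented above are used:

.. code-block:: python

    import numpy as np
    from NeurEco import NeurEcoTabular as Tabular

    input_data = np.random.rand(1000, 50)  # illustrative training data

    model = Tabular.Compressor()
    # Resume from model number 3 stored in a previously written checkpoint
    # (hypothetical path) and keep the network architecture fixed, so that
    # only the weights are updated.
    build_status = model.build(input_data,
                               checkpoint_to_start_build_from="CompressionModel.checkpoint",
                               start_build_from_model_number=3,
                               freeze_structure=True)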
.. _Control the size of the NeurEco Compression model during Build python:

Control the size of the NeurEco Compression model during build
----------------------------------------------------------------

| It is possible to balance the number of links between the compressor and the decompressor parts of the neural network using the parameter **compress_decompress_size_ratio**. Since the decompressor is in general more complex than the compressor, this ratio has to be greater than zero and less than or equal to one. A minimal sketch of this option is given at the end of this subsection.
| This option is particularly useful for IoT applications: a sequence of measurements collected by a sensor can be compressed to reduce the quantity of transmitted data. Reducing radioelectric transmissions lowers battery consumption and extends the autonomy of the IoT device. In the example below, the quantity of transmitted data is reduced by a factor of 13: 210 measurements are replaced by 16 compression coefficients. A standard NeurEco Tabular compression generates compression/decompression models whose number of links is well balanced between compression and decompression. However, if the user chooses to, that balance can be shifted to create a smaller compressor, as shown in the figure below:

.. figure:: ../../../../images/SolutionsMbedCompression.png
    :width: 800
    :alt: SolutionsMbedCompression
    :align: center

    Controlling the size of a compression model

.. note::

    The size of the compressor running on a microcontroller is reduced, while the size of the decompressor is increased.

For a detailed example of the usage of this option, see :std:ref:`Tutorial Control the size of a Compression model Python API`.
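The sketch below requests a smaller compressor through **compress_decompress_size_ratio**. The ratio value of 0.25 is an assumption chosen for demonstration; the only constraint documented above is that it lies in :math:`]0, 1]`:

.. code-block:: python

    import numpy as np
    from NeurEco import NeurEcoTabular as Tabular

    input_data = np.random.rand(1000, 210)  # illustrative sensor sequences

    model = Tabular.Compressor()
    # Hypothetical ratio: request a compressor roughly four times smaller
    # than the decompressor, within the limits of what NeurEco finds possible.
    build_status = model.build(input_data,
                               compress_decompress_size_ratio=0.25)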